Abstract

Bauwens, Mahklin, Vereshchagin and Zimand [1] and Teutsch [5] have shown that given a
string x it is possible to construct in polynomial time a list containing a short description of it. We
simplify their technique and present a shorter proof of this result.

1 Introduction

Given that the Kolmogorov complexity is not computable, it is natural to ask whether, given a string x, it is
possible to construct a short list containing a minimal (plus small overhead) description of x. Bauwens,
Mahklin, Vereshchagin and Zimand [1] and Teutsch [5] show that, surprisingly, the answer is yes.
Moreover, the short list can be computed in polynomial time. More precisely, [1] showed that
one can effectively compute lists of quadratic size guaranteed to contain a description of x whose size
is additively $O(1)$ from a minimal one (it is also shown that it is impossible to have such lists shorter
than quadratic), and that one can compute in polynomial time lists guaranteed to contain a description
that is additively $O(\log n)$ from minimal. Finally, [5] improved the latter result by reducing $O(\log n)$ to
$O(1)$.






Theorem 1 ([5]). For every standard machine U there is a constant c and a polynomial-time algorithm
f such that for every x, f(x) outputs a list of programs that contains a c-short program for x.

Let us explain the formal terms. Given a Turing machine U, a c-short program for x is a string
p such that $U(p) = x$ and the length of p is bounded by c + (length of a shortest program for x). A
machine U is optimal if $C_U(x \mid y) \le C_V(x \mid y) + O(1)$ for all machines V (where C is the Kolmogorov
complexity and the constant $O(1)$ may depend on V). An optimal machine U is standard if for every
machine V there is an efficient translator from V to U, i.e., a polynomial-time computable
function t such that for all p, y, $U(t(p), y) = V(p, y)$ and $|t(p)| = |p| + O(1)$.

Both [1] and [5] prove their results regarding polynomial-time computable lists as corollaries of
somewhat more general theorems. We present in this note a direct proof of Theorem 1, which is simpler
and shorter than the one in [5]. We emphasize that there is no technical innovation in the proof that
we present below. We use the same general approach and the same ingredients as in [1] and [5], but,
because we go straight to the target, we can take some shortcuts that render the proof simpler.

Proof overview. Essentially we want to compress in polynomial time to (close to) minimal length,
such that decompression is computable (not necessarily in polynomial time). This is of course
impossible in absolute terms, but here we compress in a weaker sense, because we obtain not a single
compressed string, but a list guaranteed to contain the (close to) optimally compressed string. It is natural
to consider seeded extractors, because an extractor's output is close to being optimally compressed
in the Shannon entropy sense. The problem is that we need an extractor with logarithmic seed (because
we want a list of polynomial size) and no entropy loss (because we want to decompress). Unfortunately,
such extractors have not yet been shown to exist. The key observation from [1], also used in [5], is that
in fact a disperser is good enough, and then one can use the disperser from [4], which has the needed
parameters. Now, why are dispersers sufficient? The answer, inspired by [1], stems from the idea from [3]
to use for this kind of compression graphs that allow on-line matching. These are unbalanced bipartite 
graphs, which, in their simplest form, have LEFT = $\{0,1\}^n$, RIGHT = $\{0,1\}^{k + \text{small overhead}}$, and
left degree = poly(n), and which permit on-line matching up to size $K = 2^k$. This means that any set
A of K left nodes, each one requesting to be matched to some adjacent right node, can be satisfied in 
the on-line manner (i.e., the requests arrive one by one and each request is satisfied before seeing the
next one; in our proof we will allow a small number of requests to be discarded, but this should also 
happen before the next request arrives). The correspondence to our problem is roughly that strings in 
LEFT are the strings that we want to compress, and the strings in RIGHT are their compressed forms. 
We need on-line matching because we are going to enumerate left strings as they are produced by the 
universal machine and each time a string is enumerated we want to find it a match, i.e., to compress 
it. In order for a graph to allow matching, it needs to have good expansion properties. It turns out that 
it is enough if left subsets of a given size $K/O(1)$ expand to size K, and a disperser has this property.
When we decompress, given the right node (the compressed string), we run the matching algorithm 
and see which left node has been matched to it. For this the decompressor needs to have n to be able
to construct the graph, and this produces the $O(\log n)$ overhead. Thus this approach is good enough to
obtain the result with $O(\log n)$-short programs from [1]. To reduce $O(\log n)$ to $O(1)$, we need the new
ideas from [5]. The point is that this time we want LEFT to have strings not of a single length n, but of
all lengths $n \ge k$ (because we can no longer afford to give n to the decompressor). In fact, it is not hard
to see that it is enough to restrict to lengths $k \le n \le 2^k$. This time we need expansion for all sets of
size $\le K$ (not just equal to a fixed $K/O(1)$), because we need each subset (of the match-requesting set
A) of strings of a given length to expand. For this, the unbalanced lossless expander from [2] is good,
except for one problem: the size of RIGHT in this expander is poly(K) and not the desired $K \cdot O(1)$.
This problem is fixed by compressing, again using the disperser from [4], to a set of size $K \cdot \mathrm{poly}(k)$,
and, finally, using a simple trick, to size $K \cdot O(1)$, which implies the $O(1)$ overhead we aim for.

2 The proof 

We use bipartite graphs $G = (L, R, E \subseteq L \times R)$. We denote LEFT(G) = L, RIGHT(G) = R. For integers
n, m, k, d we denote $N = 2^n$, $M = 2^m$, $K = 2^k$, $D = 2^d$. We denote $[n] = \{1, 2, \ldots, n\}$. A bipartite graph
G is explicit if there exists a polynomial-time algorithm that, given $x \in \mathrm{LEFT}(G)$ and i, outputs the i-th
neighbor of x.

Definition 1. A bipartite graph G is a $(K, K')$-expander if every subset of left nodes of size K has
at least $K'$ right neighbors.
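
To make Definition 1 concrete, the following minimal sketch checks the expander property by brute force on a tiny graph. The name `is_expander` and the toy graph are illustrative assumptions and not part of the construction in this note; the check enumerates all left subsets of size K, so it is only usable for very small examples.

```python
from itertools import combinations


def is_expander(neighbors, K, K_prime):
    """Return True if every set of exactly K left nodes has >= K_prime right neighbors.

    `neighbors` maps each left node to the set of its right neighbors.
    """
    for subset in combinations(list(neighbors), K):
        covered = set()
        for x in subset:
            covered.update(neighbors[x])
        if len(covered) < K_prime:
            return False
    return True


# Hypothetical toy graph: four left nodes, three right nodes.
toy = {"a": {0, 1}, "b": {1, 2}, "c": {0, 2}, "d": {0, 1}}

print(is_expander(toy, 2, 2))  # every pair of left nodes sees at least 2 right nodes
print(is_expander(toy, 2, 3))  # fails: the pair (a, d) only sees {0, 1}
```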

Theorem 2 (Guruswami, Umans, Vadhan [2]). For every constant $\alpha$, every n, every $k \le n$, and every $\varepsilon > 0$,
there exists an explicit graph that is a $(K', (1 - \varepsilon) D K')$-expander for every $K' \le K$, in which every left node has degree
$D = O((nk/\varepsilon)^{1 + 1/\alpha})$, $L = [N]$, $R = [M]$, and $M \le D^2 \cdot K^{1+\alpha}$.

Definition 2. A bipartite graph $G = (L, R, E)$ is a $(K, \delta)$-disperser if every subset $B \subseteq L$ with $|B| \ge K$
has at least $(1 - \delta)|R|$ distinct neighbors.
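
As a small illustration of Definition 2, the sketch below tests the disperser property by brute force. It relies on the observation that checking subsets of size exactly K suffices, since adding nodes to a subset can only add neighbors. The function name `is_disperser` and the toy graph are assumptions made for this example only. The last call illustrates the fact used later in the proof of Lemma 7: a $(K, 1/2)$-disperser whose right side has size 2K is a $(K, K)$-expander.

```python
from fractions import Fraction
from itertools import combinations


def is_disperser(neighbors, right_size, K, delta):
    """Return True if every left subset of size >= K hits >= (1 - delta) * right_size right nodes.

    Only subsets of size exactly K are enumerated; larger subsets see at least as much.
    """
    threshold = (1 - Fraction(delta)) * right_size
    for subset in combinations(list(neighbors), K):
        covered = set()
        for x in subset:
            covered.update(neighbors[x])
        if len(covered) < threshold:
            return False
    return True


# Hypothetical toy graph with |R| = 4 = 2K for K = 2.
toy = {"u": {0, 1}, "v": {2, 3}, "w": {0, 2}, "z": {1, 3}}

print(is_disperser(toy, 4, 2, Fraction(1, 2)))  # any 2 left nodes reach >= 2 = K right nodes
```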

Theorem 3 (Ta-Shma, Umans, Zuckerman [4]). For every K, n and constant $\delta$, there exist explicit
$(K, \delta)$-dispersers $G = (L = \{0,1\}^n, R = \{0,1\}^m, E \subseteq L \times R)$ in which every node in L has degree
$D = n 2^{O((\log\log n)^2)}$ and $|R| = \alpha K D / n^3$, for some constant $\alpha$ (see footnote 3).

The key combinatorial object that we use is provided in the following lemma. 

Footnote 3. [4] only indicates that D = poly(n). The value $D = n 2^{O((\log\log n)^2)}$ is obtained by reworking the proof of Lemma 6.4 in [4]
using the extractor with constant loss from Theorem 4.21 in [2].
Lemma 4. For every constant c and every sufficiently large k, there exists an explicit bipartite graph
$H_k$ with the following properties:

1. $\mathrm{LEFT}(H_k) = \{0,1\}^k \cup \{0,1\}^{k+1} \cup \ldots \cup \{0,1\}^{2^k}$, $\mathrm{RIGHT}(H_k) = \{0,1\}^{k+1}$,

2. Each left node x has degree poly(|x|),

3. $H_k$ is a $(K/c^2, K)$-expander.

We defer the proof of this lemma for later. 

We show how the lemma implies Theorem 1. We start with the following lemma about on-line
matching (recall that this means that one receives a sequence of requests to match left nodes with one 
of their adjacent right nodes and each request must be satisfied, or discarded, before seeing the next 
one). 

Lemma 5. If K on-line matching requests are made in a $(K/c^2, K)$-expander, all but fewer than $K/c^2$
can be satisfied.

Proof. Suppose there are K requests for matching left nodes and we attempt to satisfy them in the
obvious greedy manner. Suppose that $K/c^2$ requests cannot be satisfied (because all their neighbors
have been used to match previous requests). The $K/c^2$ left nodes that are not satisfied have at least K right
neighbors, and each of these neighbors is matched to a distinct satisfied request. Thus at least K requests
have been satisfied in addition to the $K/c^2$ unsatisfied ones, which exceeds the total of K requests, a contradiction. □
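
The greedy strategy in the proof of Lemma 5 can be sketched as follows. This is a minimal illustration under our own assumptions: the function name `greedy_online_matching` and the toy graph are hypothetical, and neighbor lists are ordered so that "first free neighbor" is well defined. In the toy graph any two left nodes jointly have at least 4 right neighbors, so it is a $(K/c^2, K)$-expander for $K = 4$ and $c^2 = 2$, and the lemma predicts fewer than 2 rejections among 4 requests.

```python
def greedy_online_matching(neighbors, requests):
    """Serve requests one at a time, before seeing the next one.

    Each requested left node is matched to its first free right neighbor,
    or rejected if all of its right neighbors are already taken.
    """
    matched = {}   # right node -> left node it was assigned to
    rejected = []  # left nodes whose request was discarded
    for x in requests:
        for y in neighbors[x]:
            if y not in matched:
                matched[y] = x
                break
        else:
            rejected.append(x)
    return matched, rejected


# Hypothetical toy graph: every pair of left nodes has 5 >= 4 right neighbors.
toy = {"a": [0, 3, 4], "b": [1, 3, 5], "c": [2, 4, 5], "x": [0, 1, 2]}

matched, rejected = greedy_online_matching(toy, ["a", "b", "c", "x"])
print(matched)   # a, b, c take right nodes 0, 1, 2
print(rejected)  # x finds all of 0, 1, 2 taken and is the single rejection
```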

Proof of Theorem 1.

We define the following machine V ("the decompressor"). 

(1) On inputs of the form 00p, V outputs p.

(2) On inputs of the form 01p, V simulates U(p) and, if $U(p) = x$ and $|x| > 2^{|p|}$, outputs x.

(3) On inputs of the form 1p, V works as follows:

V calculates its value on all inputs of the form 1p' with $|p'| = |p|$ as follows. Let $k = |p| - 1$.
Enumerate the elements of the set $\{x \mid \exists q \text{ of length } k,\ U(q) = x\}$. When an element x is enumerated
and |x| is between k and $2^k$, pass x to the on-line matching algorithm for $H_k$. If x is matched to p',
then V(1p') outputs x. If x is rejected because all its right neighbors in $H_k$ have already been used to
match other elements during the computation of V(1p') for strings p' of length $k + 1$, continue the
enumeration.
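
The three cases of V can be mimicked on a toy scale as below. This is only a sketch under strong simplifying assumptions that are ours and not the paper's: the universal machine U is replaced by a finite dictionary (so every simulation halts and the enumeration is just a lexicographic loop rather than a dovetailed one), and `toy_neighbors` is a hypothetical stand-in for the neighbor function of $H_k$ with no expansion guarantee. All names are illustrative.

```python
from itertools import product


def toy_V(v, U, neighbors):
    """Toy decompressor following cases (1)-(3); returns None when V has no output."""
    if v.startswith("00"):                       # case (1): literal copy
        return v[2:]
    if v.startswith("01"):                       # case (2): very compressible strings
        p = v[2:]
        x = U.get(p)
        return x if x is not None and len(x) > 2 ** len(p) else None
    if not v.startswith("1") or len(v) < 2:
        return None
    p = v[1:]                                    # case (3): on-line matching in H_k
    k = len(p) - 1
    matched = {}                                 # right node -> left node
    seen = set()                                 # each string requests a match only once
    for bits in product("01", repeat=k):         # all programs q of length k
        x = U.get("".join(bits))
        if x is None or x in seen or not (k <= len(x) <= 2 ** k):
            continue
        seen.add(x)
        for y in neighbors(k, x):                # greedy: first free right neighbor
            if y not in matched:
                matched[y] = x
                break
    return matched.get(p)


def toy_neighbors(k, x):
    """Hypothetical neighbor function: two right nodes of length k + 1."""
    return [x[: k + 1].ljust(k + 1, "0"), "0" * (k + 1)]


# Hypothetical finite stand-in for U.
U = {"": "10", "00": "1011", "01": "111", "10": "1011", "11": "0"}

print(toy_V("1101", U, toy_neighbors))  # right node "101" was matched to "1011"
```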



Observe that during computations of the form (3), at most K matching requests are made and
therefore, by the property of $H_k$ and Lemma 5, there are fewer than $K/c^2$ rejections. It follows that if v is a rejected
node then $C_U(v) \le k - 2\log c + \log c + 2\log\log c + O(1) < k$, for c a large enough constant. Indeed, a
rejected string can be described by its index in the set of rejected strings, written on exactly $k - 2\log c$
bits, and c (which is needed in order to reconstruct k and then enumerate the set of rejected strings). The
additional $2\log\log c$ term is required for concatenating the index and c. It follows that if x is a string
such that $C_U(x) = k$ and $k \in \{\log|x|, \ldots, |x|\}$, then there exists p of length $k + 1$ such that $V(1p) = x$:
such an x is enumerated, is a left node of $H_k$, and is not rejected, because rejected strings have complexity less than k.
Moreover, p is one of the right neighbors of x in $H_k$.

Now, for each x, let list(x) be the list containing the following strings: 00x, all strings of length <
log |x| prefixed with 01, and all the neighbors of x in $H_k$ prefixed with a 1, for $k = |x|, |x| - 1, \ldots, \log|x|$.
Note that for every x, list(x) can be computed in polynomial time, and there exists $v \in \mathrm{list}(x)$ with
$|v| \le C_U(x) + O(1)$ such that $V(v) = x$. Finally, using the "translator" t from V programs to U programs,
take $f(x) = \{t(v) \mid v \in \mathrm{list}(x)\}$. Since t is computable in polynomial time, $U(t(v)) = V(v)$ and
$|t(v)| = |v| + O(1)$, we are done. □
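
The shape of list(x) can be sketched in a few lines. The sketch below is an illustration under our own assumptions: `toy_neighbors` is a hypothetical hash-based placeholder for the explicit neighbor function of $H_k$ (it has constant degree and no expansion guarantee), `build_list` is an illustrative name, $\log|x|$ is rounded up, and the final translation step through t is omitted.

```python
import hashlib
from itertools import product


def toy_neighbors(k, x, degree=3):
    """Hypothetical placeholder for the neighbors of x in H_k: strings of length k + 1."""
    out = []
    for i in range(degree):
        digest = hashlib.sha256(f"{k}:{i}:{x}".encode()).digest()
        bits = "".join(f"{byte:08b}" for byte in digest)
        out.append(bits[: k + 1])  # assumes k + 1 <= 256
    return out


def build_list(x, neighbors=toy_neighbors):
    """Candidate programs for V: 00x, short strings prefixed by 01, H_k-neighbors prefixed by 1."""
    n = len(x)
    low = (n - 1).bit_length()  # ceil(log2(n)) for n >= 1
    candidates = ["00" + x]
    for length in range(low):   # all strings of length < log|x|
        for bits in product("01", repeat=length):
            candidates.append("01" + "".join(bits))
    for k in range(n, low - 1, -1):  # k = |x|, |x| - 1, ..., ceil(log|x|)
        for y in neighbors(k, x):
            candidates.append("1" + y)
    return candidates


candidates = build_list("10110100")
print(len(candidates))
```

With the real graphs $H_k$ in place of the placeholder, the third loop is where the polynomial list size comes from, since each left node has polynomial degree.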

It remains to prove Lemma 4. We use two types of graphs given in the following two lemmas.
Lemma 6. For every n and every $k \le n$, there exists a bipartite graph $\mathrm{GUV}_{n,k}$ with each left node having
degree $D = \lambda (nk)^2$ (for some fixed constant $\lambda$), $\mathrm{LEFT}(\mathrm{GUV}_{n,k}) = \{0,1\}^n$, $\mathrm{RIGHT}(\mathrm{GUV}_{n,k}) = [M]$
with $M \le D^2 K^2$, which is a $(K', (1/2) D K')$-expander for every $K' \le K$.

Proof. This is the Guruswami, Umans, Vadhan expander with parameters $\alpha = 1$, $\varepsilon = 1/2$. □

Lemma 7. For every k, there exists a bipartite graph $F_k$ with each left node having degree $D = O(k^3)$,
$\mathrm{LEFT}(F_k) = \{0,1\}^{8k}$, $\mathrm{RIGHT}(F_k) = \{0,1\}^{k+1}$, which is a $(K, K)$-expander.



Proof. Consider the Ta-Shma, Umans, Zuckerman $(K, 1/2)$-disperser G, with $\mathrm{LEFT}(G) = \{0,1\}^{8k}$,
$\mathrm{RIGHT}(G) = \{0,1\}^m$, left degree $D = O(k 2^{O((\log\log k)^2)})$ and $|\mathrm{RIGHT}(G)| = \alpha K D / (8k)^3$.

To increase the size of the right set to be at least 2K, we make RIGHT consist of $2 \lceil (8k)^3 / (\alpha D) \rceil$ copies
of RIGHT(G) connected to LEFT(G) in the same way as the original nodes. Thus each right node is
labelled by a string of length $\ge k + 1$ and the left degree is $O(k^3)$.

By merging the nodes whose labels have the same prefix of length $k + 1$, we obtain the graph $F_k$,
which as desired has $\mathrm{RIGHT}(F_k) = \{0,1\}^{k+1}$ and is a $(K, 1/2)$-disperser (because the merge operation
can only improve the dispersion property).

Thus, every left subset of size K has at least $(1/2) \cdot 2K = K$ right neighbors, i.e., $F_k$ is a $(K, K)$-expander.

□ 

We are now prepared to prove Lemma 4.
Proof of Lemma 4.

Let us fix c and a sufficiently large k. 

We first construct the graph $G_k$ as the union $\mathrm{GUV}_{k,k} \cup \mathrm{GUV}_{k+1,k} \cup \ldots \cup \mathrm{GUV}_{2^k,k}$.
Note that $\mathrm{LEFT}(G_k)$ consists of all strings having length between k and $2^k$. For $\mathrm{RIGHT}(G_k)$, we
shift the numerical labels of the right nodes in each set in the obvious way before taking the union, so
that the sets that we union are pairwise disjoint. We have

$$|\mathrm{RIGHT}(G_k)| \le \sum_{n=k}^{2^k} \lambda^2 (nk)^4 K^2 = \lambda^2 k^4 K^2 \sum_{n=k}^{2^k} n^4 \le \lambda^2 k^4 \cdot K^7 < K^8,$$

for k sufficiently large. By padding each right node in $G_k$ with 100...0, we label each right node by a
string of length 8k.

Note that, provided k is sufficiently large, $G_k$ is a $(K/c^2, K)$-expander. Indeed, take $B \subseteq \mathrm{LEFT}(G_k)$,
$|B| = K/c^2$. B has strings of different lengths. If we partition B into subsets of strings corresponding to
the different lengths, each subset with strings of length, say, n expands according to $\mathrm{GUV}_{n,k}$ by a factor
of $(1/2) \lambda (nk)^2 \ge c^2$ (if k is large enough). Since different subsets of the partition map into disjoint
right subsets, the above assertion follows.

The degree of every left node x in $G_k$ is bounded by poly(|x|) because the edges originating in x are
those from the graph $\mathrm{GUV}_{|x|,k}$. So $G_k$ is almost what we need except that the right nodes have length 8k
instead of $k + 1$. We fix this issue by compressing strings of length 8k to length $k + 1$ using the graph
$F_k$ from Lemma 7.

More precisely, we build the graph $H_k$ by taking the product of the above graph $G_k$ with the graph
$F_k$. Thus $\mathrm{LEFT}(H_k) = \mathrm{LEFT}(G_k)$, $\mathrm{RIGHT}(H_k) = \mathrm{RIGHT}(F_k)$ and (x, y) is an edge in $H_k$ if there exists
$z \in \mathrm{RIGHT}(G_k) \subseteq \mathrm{LEFT}(F_k)$ such that (x, z) is an edge in $G_k$ and (z, y) is an edge in $F_k$. As desired,
$\mathrm{LEFT}(H_k)$ consists of all strings x having length between k and $2^k$, $\mathrm{RIGHT}(H_k) = \{0,1\}^{k+1}$, the degree
of every left node x is bounded by $\mathrm{poly}(|x|) \cdot \mathrm{poly}(k) = \mathrm{poly}(|x|)$, and $H_k$ is a $(K/c^2, K)$-expander, because
each left subset of size $K/c^2$ expands to size at least K in $G_k$ and then it keeps its size at least K when
passing through $F_k$. □
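
The product of two bipartite graphs used above amounts to composing their neighbor functions. The following sketch shows this on a toy example; the function name `compose` and the two small graphs are hypothetical and only illustrate the definition of the edges of $H_k$. Note that the degree of a left node in the product is at most the product of the two degrees, which is how the poly(|x|) bound arises.

```python
def compose(g_neighbors, f_neighbors):
    """Neighbor function of the product graph.

    y is a neighbor of x iff there is some z with (x, z) an edge of G and (z, y) an edge of F.
    """
    def h_neighbors(x):
        result = []
        for z in g_neighbors(x):
            for y in f_neighbors(z):
                if y not in result:  # keep each right node once, in first-seen order
                    result.append(y)
        return result
    return h_neighbors


# Hypothetical toy graphs: G goes from x-nodes to z-nodes, F from z-nodes to y-nodes.
G = {"x1": ["z1", "z2"], "x2": ["z2", "z3"]}
F = {"z1": ["y1"], "z2": ["y1", "y2"], "z3": ["y3"]}

H = compose(lambda x: G[x], lambda z: F[z])
print(H("x1"))  # neighbors reached through z1 and z2
print(H("x2"))  # neighbors reached through z2 and z3
```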

Note. The above construction yields in Theorem 1 a list of size $O(n^8)$. If in Lemma 6 we take a
large $\alpha$ (instead of $\alpha = 1$), we obtain list size $n^{6+\delta}$, for an arbitrarily small positive constant $\delta$.
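
One way to count the list size for $\alpha = 1$, using only the degree bounds stated in Lemmas 6 and 7, is sketched below. This is our own back-of-the-envelope computation with $n = |x|$, offered as a sanity check rather than as a statement taken from [1] or [5]; the term $2n$ is a crude bound on the number of strings of length less than $\log n$.

```latex
\[
\deg_{H_k}(x) \;\le\; \deg_{G_k}(x) \cdot \deg_{F_k}
  \;=\; \lambda (nk)^2 \cdot O(k^3) \;=\; O(n^2 k^5),
\]
\[
|\mathrm{list}(x)| \;\le\; 1 + 2n + \sum_{k = \lceil \log n \rceil}^{n} O(n^2 k^5)
  \;=\; O\!\left(n^2 \cdot n^6\right) \;=\; O(n^8).
\]
```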